A three-network architecture for on-line learning and optimization based on adaptive dynamic programming

Authors

Haibo He

Zhen Ni

Jian Fu

Abstract

In this paper, we propose a novel adaptive dynamic programming (ADP) architecture with three networks (an action network, a critic network, and a reference network) to develop an internal goal representation for online learning and optimization. Unlike the traditional ADP design, which normally uses an action network and a critic network, our approach integrates a third network, the reference network, into the actor-critic design framework to automatically and adaptively build an internal reinforcement signal that facilitates learning and optimization over time to accomplish goals. We present the detailed design architecture and its associated learning algorithm to explain how effective learning and optimization can be achieved in this new ADP architecture. Furthermore, we test the performance of our architecture on both the cart-pole balancing task and the triple-link inverted pendulum balancing task, which are popular benchmarks in the community, to demonstrate its learning and control performance over time. © 2011 Elsevier B.V. All rights reserved.
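To make the three-network structure concrete, the following is a minimal sketch of one forward pass, not the authors' implementation. The abstract only states that a reference network builds an internal reinforcement signal inside an actor-critic design; the specific wiring used here (the reference network reads the state and action, and the critic reads the state, action, and internal signal), the layer sizes, the tanh activations, and all names such as `make_mlp` and `three_network_forward` are assumptions made for illustration. No learning rule is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(n_in, n_hidden, n_out):
    """Random weights for a one-hidden-layer network (illustrative sizes)."""
    return {
        "W1": rng.uniform(-1.0, 1.0, (n_hidden, n_in)),
        "W2": rng.uniform(-1.0, 1.0, (n_out, n_hidden)),
    }

def forward(net, x):
    """tanh hidden layer followed by a tanh output layer (an assumption)."""
    hidden = np.tanh(net["W1"] @ x)
    return np.tanh(net["W2"] @ hidden)

# Assumed state layout, e.g. cart position, cart velocity,
# pole angle, pole angular velocity for a cart-pole task.
STATE_DIM = 4

action_net = make_mlp(STATE_DIM, 6, 1)          # state -> control action u
reference_net = make_mlp(STATE_DIM + 1, 6, 1)   # (state, u) -> internal signal s
critic_net = make_mlp(STATE_DIM + 2, 6, 1)      # (state, u, s) -> value estimate J

def three_network_forward(state):
    """One forward pass through the assumed action/reference/critic chain."""
    u = forward(action_net, state)
    s = forward(reference_net, np.concatenate([state, u]))
    J = forward(critic_net, np.concatenate([state, u, s]))
    return u, s, J

state = np.array([0.0, 0.1, 0.02, -0.05])
u, s, J = three_network_forward(state)
```

In this sketch the internal signal `s` takes the place that an external reinforcement signal would occupy in a two-network actor-critic design, which is one plausible reading of the "internal reinforcement signal" described above.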

Similar resources

A Controller Design with ANFIS Architecture Attendant Learning Ability for SSSC-Based Damping Controller Applied in Single Machine Infinite Bus System

Static Synchronous Series Compensator (SSSC) is a series-compensating Flexible AC Transmission System (FACTS) controller that maintains power flow control on a transmission line by injecting a voltage in quadrature with the line current and in series with the line. In this work, an Adaptive Network-based Fuzzy Inference System controller (ANFISC) has been proposed for controlling o...

Adaptive Predictive Controllers Using a Growing and Pruning RBF Neural Network

An adaptive version of a growing and pruning RBF neural network has been used to predict the system output and implement Linear Model-Based Predictive Controller (LMPC) and Non-linear Model-based Predictive Controller (NMPC) strategies. A radial-basis neural network with growing and pruning capabilities is introduced to carry out on-line model identification. An Unscented Kal...

A Convolutional Neural Network based on Adaptive Pooling for Classification of Noisy Images

A convolutional neural network is one of the effective methods for classifying images; it learns using convolutional, pooling, and fully-connected layers. All kinds of noise disrupt the operation of this network. Noisy images reduce classification accuracy and increase convolutional neural network training time. Noise is an unwanted signal that destroys the original signal. Noise chang...

Stochastic Dynamic Programming with Markov Chains for Optimal Sustainable Control of the Forest Sector with Continuous Cover Forestry

We present a stochastic dynamic programming approach with Markov chains for optimal control of the forest sector. The forest is managed via continuous cover forestry and the complete system is sustainable. Forest industry production, logistic solutions and harvest levels are optimized based on the sequentially revealed states of the markets. Adaptive full system optimization is necessary for co...

A New Fuzzy Stabilizer Based on Online Learning Algorithm for Damping of Low-Frequency Oscillations

A multi-objective Honey Bee Mating Optimization (HBMO) designed with an online learning mechanism is proposed in this paper to optimize the double Fuzzy-Lead-Lag (FLL) stabilizer parameters in order to improve the damping of low-frequency oscillations in a multi-machine power system. The proposed double FLL stabilizer consists of a low-pass filter and two fuzzy logic controllers whose parameters can be set by the ...

Journal:

Neurocomputing

Volume 78, Issue

Pages -

Publication date 2012
